Financial Data Modeling using a Hybrid Bayesian Network Structured Learning Algorithm
Authors
Abstract
In this paper, a group of hybrid incremental learning algorithms for Bayesian network structures is proposed. The central idea of these hybrid algorithms is to use a polynomial-time constraint-based technique to build a candidate parent set for each domain variable, followed by a hill climbing search procedure that refines the current network structure under the guidance of the candidate parent sets. Experimental results show that the authors' hybrid incremental algorithms offer considerable savings in computational complexity while obtaining better model accuracy than existing incremental algorithms. One of the hybrid algorithms is also used to model financial data generated from American stock exchange markets. It identifies the predictors of the stock return among hundreds of financial variables and, at the same time, recovers the movement trend of the stock return.
DOI: 10.4018/jcini.2012010103. International Journal of Cognitive Informatics and Natural Intelligence, 6(1), 48-71, January-March 2012. Copyright © 2012, IGI Global.
[...] & Blum, 2003; Griffiths & Tenenbaum, 2005; Griffiths & Tenenbaum, 2007; Pacer & Griffiths, 2011; Buchsbaum, Gopnik, Griffiths, & Shafto, 2011), social cognition (Baker, Tenenbaum, & Saxe, 2007), and some other cognitive problems. However, all the Bayesian network models used in these studies are static models, which can only represent an average situation over a particular period of time. This paper focuses on incremental Bayesian network structure learning algorithms, which can be used to dynamically describe how the cognitive process changes over time.
Over a decade of research into efficient incremental learning algorithms for Bayesian network structures has yielded a number of important results and computational algorithms (Buntine, 1991; Friedman & Goldszmidt, 1997; Lam, 1998; Lam & Bacchus, 1994b; Nielsen & Nielsen, 2007; Roure, 2004a, 2004b; Shi & Tan, 2007; Niinimaki, Parviainen, & Koivisto, 2011; Pennock & Xia, 2011). It is well recognized, however, that these existing algorithms often suffer from high computational complexity, which prevents them from solving complex and large-scale practical problems. In this paper, a hybrid learning template is proposed to overcome this deficiency, and a group of algorithms is developed based on the template. Our template combines polynomial-time constraint-based techniques with the hill climbing search procedure, decoupling the problem into two smaller and less complex computations. The constraint-based techniques make the best use of the information in the current network structure and the newly arrived datasets to build compact candidate parent sets for the domain variables. The hill climbing search procedure is then employed to refine the current network structure under the guidance of the candidate parent sets.
Two kinds of constraint-based techniques are proposed. The first builds candidate parent sets from a global view: it learns an undirected tree-shaped network structure, and then extracts candidate parent sets based on both the undirected structure and the current network structure. The second builds candidate parent sets from a local view: it selects the variables most relevant to each domain variable as its candidate parents by using polynomial-time feature selection algorithms. Two kinds of hybrid incremental algorithms are developed based on these two constraint-based techniques.
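The two-stage template described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes fully observed discrete data, uses a mutual-information ranking as a stand-in for the polynomial-time feature selection of the local view, scores structures with BIC, and allows only edge additions and deletions. All function names (`candidate_parents`, `family_bic`, `hill_climb`) are hypothetical, and the incremental setting is approximated only by warm-starting the search from a current structure.

```python
import math
from collections import Counter

def mutual_information(data, i, j):
    """Empirical mutual information (nats) between discrete columns i and j."""
    n = len(data)
    ci = Counter(row[i] for row in data)
    cj = Counter(row[j] for row in data)
    cij = Counter((row[i], row[j]) for row in data)
    return sum((c / n) * math.log(c * n / (ci[a] * cj[b]))
               for (a, b), c in cij.items())

def candidate_parents(data, k):
    """Stage 1 (assumed local view): keep the k variables with the highest
    mutual information to each variable as its candidate parents."""
    n_vars = len(data[0])
    cands = {}
    for v in range(n_vars):
        ranked = sorted((u for u in range(n_vars) if u != v),
                        key=lambda u: mutual_information(data, v, u),
                        reverse=True)
        cands[v] = set(ranked[:k])
    return cands

def family_bic(data, child, parents):
    """BIC contribution of one node given a parent set (assumed score)."""
    n = len(data)
    ps = sorted(parents)
    joint = Counter((tuple(row[p] for p in ps), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in ps) for row in data)
    loglik = sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
    child_states = len({row[child] for row in data})
    parent_configs = math.prod(len({row[p] for row in data}) for p in ps)
    return loglik - 0.5 * math.log(n) * (child_states - 1) * parent_configs

def creates_cycle(parents, child, new_parent):
    """True if the edge new_parent -> child would close a directed cycle,
    i.e. if child is already an ancestor of new_parent."""
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def hill_climb(data, cands, current=None):
    """Stage 2: greedy search restricted to the candidate parent sets.
    `current` is an optional starting structure {node: set of parents}."""
    n_vars = len(data[0])
    parents = {v: set((current or {}).get(v, ())) for v in range(n_vars)}
    while True:
        best_gain, best_move = 1e-9, None
        for child in range(n_vars):
            base = family_bic(data, child, parents[child])
            for u in cands[child] | parents[child]:
                if u in parents[child]:
                    trial = parents[child] - {u}      # try deleting u -> child
                elif not creates_cycle(parents, child, u):
                    trial = parents[child] | {u}      # try adding u -> child
                else:
                    continue
                gain = family_bic(data, child, trial) - base
                if gain > best_gain:
                    best_gain, best_move = gain, (child, trial)
        if best_move is None:
            return parents
        parents[best_move[0]] = best_move[1]
```

Because each variable considers at most `k` candidate parents plus its current ones, every search step evaluates on the order of `n_vars * k` moves rather than all pairs of variables, which is the kind of saving the template aims for. Calling `hill_climb(new_data, candidate_parents(new_data, k), current=old_structure)` is one simple way to reuse an existing structure when new data arrives.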
An extensive comparative study of our algorithms, both analytical and computational, is carried out against a wide cross-section of state-of-the-art incremental algorithms and some batch algorithms on classic benchmark datasets. To our knowledge, this is the first comprehensive study to compare the existing algorithms on various datasets with different sample sizes. The results show that our hybrid algorithms, especially the one based on the global view, consistently offer considerable computational complexity savings compared to the existing algorithms, and some of them also obtain better model accuracy. To further examine the computational complexity and validity of our approach, the algorithm based on the global view is also applied to a real-world stock return prediction problem. Stock return prediction, which covers both predicting the future value of the stock return and predicting its future movement trend, has gained unprecedented popularity in financial market forecasting research in recent years (Avramov & Chordia, 2006a, 2006b; Banz, 1980; Basu, 1977; Fama & French, 1989, 1992; Jegadeesh, 1990; Jegadeesh & Titman, 1993; Keim & Stambaugh, 1986; Lettau & Ludvigson, 2001). Accurately identifying the predictors of the stock return is the foundation of both tasks. Unfortunately, no good solution to this problem exists to date. Our algorithm uses Bayesian network structures to search for the predictors automatically, which provides a new way of solving this prediction problem. The rest of this paper is organized as follows: first, our hybrid incremental algorithms are introduced. The experiments are described afterwards, followed by a simple theoretical analysis.
Similar resources
The modeling of body's immune system using Bayesian Networks
In this paper, urinary infection, which is a common symptom of the decline of the immune system, is discussed based on well-known machine learning algorithms, such as Bayesian networks in both Markov and tree structures. Large-scale sampling has been carried out to evaluate the performance of the Bayesian network algorithm. A total of 4052 samples were obtained from the database of the Tak...
A Surface Water Evaporation Estimation Model Using Bayesian Belief Networks with an Application to the Persian Gulf
Evaporation is an influential climate component in water resources management and is of special importance in agriculture. In this paper, Bayesian belief networks (BBNs), as a non-linear modeling technique, provide an evaporation estimation method under uncertainty. As a case study, we estimated the surface water evaporation of the Persian Gulf and worked with a dataset of observations ...
Learning Bayesian Network Structure Using Genetic Algorithm with Consideration of the Node Ordering via Principal Component Analysis
The most challenging task in dealing with Bayesian networks is learning their structure. Two classical approaches are often used for learning Bayesian network structure: the constraint-based method and the score-and-search-based method. However, neither of them is completely satisfactory. Therefore the heuristic search such as Genetic Alg...
Learning Bayesian Network Structure using Markov Blanket in K2 Algorithm
A Bayesian network is a graphical model that represents a set of random variables and their causal relationships via a Directed Acyclic Graph (DAG). There are basically two methods used for learning Bayesian networks: parameter learning and structure learning. One of the most effective structure-learning methods is the K2 algorithm. Because the performance of the K2 algorithm depends on node...
Journal title:
- IJCINI
Volume 6, Issue
Pages -
Publication date 2012